Tests for Gene Clusters Satisfying the Generalized Adjacency Criterion

Authors

  • Ximing Xu
  • David Sankoff
Abstract

We study a parametrized definition of gene clusters that permits control over the trade-off between increasing gene content and conserving gene order within a cluster. The definition is based on the notion of generalized adjacency, the property shared by any two genes no farther apart, in the linear order of a chromosome, than a fixed threshold parameter θ. A cluster in two or more genomes is then a maximal set of markers that, in each genome, form a connected chain of generalized adjacencies. Since even pairs of randomly constructed genomes may have many generalized adjacency clusters in common, we study the statistical properties of generalized adjacency clusters under the null hypothesis that the markers are ordered completely randomly on the genomes. We derive expressions for the exact values of the expected number of clusters of a given size, for large and small values of the parameter. We discover through simulations that the trend from small to large clusters as a function of the parameter θ exhibits a “cut-off” phenomenon at or near √θ as genome size increases.
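
The cluster definition and the random-order null hypothesis described above can be illustrated with a minimal Python sketch. It is not the authors' code and rests on stated assumptions: two genes count as generalized-adjacent when their positions differ by at most θ, and clusters are taken to be the connected components of the graph whose edges are pairs that are generalized-adjacent in both genomes (one reading of "connected chain"). The names `ga_clusters` and `null_cluster_sizes` are hypothetical, and singletons are counted as clusters of size 1.

```python
import random
from collections import Counter


def ga_clusters(genome_a, genome_b, theta):
    """Return clusters as sets of markers.

    Assumption: a cluster is a connected component of the graph that
    links two markers when their positions differ by at most theta in
    BOTH genomes. Both genomes must contain the same markers, each once.
    """
    pos_b = {g: i for i, g in enumerate(genome_b)}
    parent = {g: g for g in genome_a}  # union-find forest

    def find(x):
        # Find the component root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    n = len(genome_a)
    for i, g in enumerate(genome_a):
        # Only pairs within theta positions in genome A can be edges.
        for j in range(i + 1, min(i + theta, n - 1) + 1):
            h = genome_a[j]
            if abs(pos_b[g] - pos_b[h]) <= theta:
                parent[find(g)] = find(h)

    groups = {}
    for g in genome_a:
        groups.setdefault(find(g), set()).add(g)
    return list(groups.values())


def null_cluster_sizes(n, theta, trials, seed=0):
    """Estimate the mean number of clusters of each size when one genome
    is a uniformly random permutation of the other (the null hypothesis)."""
    rng = random.Random(seed)
    identity = list(range(n))
    counts = Counter()
    for _ in range(trials):
        shuffled = identity[:]
        rng.shuffle(shuffled)
        for cluster in ga_clusters(identity, shuffled, theta):
            counts[len(cluster)] += 1
    return {size: total / trials for size, total in sorted(counts.items())}


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6]
    b = [2, 1, 3, 6, 5, 4]
    print(ga_clusters(a, b, theta=1))
    print(null_cluster_sizes(n=100, theta=3, trials=200))
```

On the toy genomes in the `__main__` block, θ = 1 separates the markers into {1, 2}, {3} and {4, 5, 6}, while θ = 2 merges all six into a single cluster, which shows how raising the threshold trades order conservation for gene content. Simulated averages from `null_cluster_sizes` are estimates only; they are not a substitute for the exact expectations the abstract refers to.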

Similar resources

Power Boosts for Cluster Tests

Gene cluster significance tests that are based on the number of genes in a cluster in two genomes, and how compactly they are distributed, but not their order, may be made more powerful by the addition of a test component that focuses solely on the similarity of the ordering of the common genes in the clusters in the two genomes. Here we suggest four such tests, compare them, and investigate on...

Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members

Graphs have many applications in real-world problems. When we deal with huge volumes of data, analyzing the data is difficult or sometimes impossible. In big data problems, clustering is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graphs, but we do not have any choice to select the number of clusters and the number of members ...

Grouping Objects to Homogeneous Classes Satisfying Requisite Mass

Grouping datasets plays an important role in much scientific research. Depending on data features and applications, different constraints are imposed on groups, while having groups with similar members is always a main criterion. In this paper, we propose an algorithm for grouping the objects with random labels, nominal features having too many nominal attributes. In addition, the size constra...

Group rings satisfying generalized Engel conditions

Let R be a commutative ring with unity of characteristic r≥0 and G be a locally finite group. For each x and y in the group ring RG define [x,y]=xy-yx and inductively [x,_(n+1) y]=[[x,_(n) y],y]. In this paper we show that necessary and sufficient conditions for RG to satisfy [x^m(x,y),_(n(x,y)) y]=0 are: 1) if r is a power of a prime p, then G is a locally nilpotent group an...

Generalized Resolution and Minimum Aberration for Nonregular Fractional Factorial Designs

Seeking the optimal design with a given number of runs is a main problem in fractional factorial designs (FFDs). The resolution of a design, introduced by Box and Hunter (1961), is the most widely used criterion and was originally applied to regular FFDs. The resolution criterion is extended to non-regular FFDs, called the generalized resolution criterion. This criterion is providing the idea of ge...

Publication date: 2008